- THE THEORY OF OPERATION OF SPP
-
- SPP is a strange combination of a one-pass C compiler and a file copying
- program. With the exception of about 20 lines of code, the two programs are
- completely independent!
-
- The job of the compiler is to know the types of all functions and variables,
- to insert entry macros before the first executable statement of each function,
- to replace return statements with exit macros, and to insert an exit macro at
- the end of every function whose flow of control could fall off the end. In
- order to do this, all include files must be processed and all macros must be
- correctly expanded.
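-
- As a rough illustration of that job, the sketch below shows what two small
- functions might look like after instrumentation. The macro names ENTER,
- EXIT_INT and EXIT_VOID are made up for this example and are not claimed to
- be the names Sherlock uses; the tracing they do is likewise an assumption.

```c
#include <stdio.h>

/* Hypothetical tracing macros; the real Sherlock macros may differ. */
#define ENTER(name) \
    do { printf("enter %s\n", name); } while (0)
#define EXIT_VOID(name) \
    do { printf("leave %s\n", name); return; } while (0)
#define EXIT_INT(name, e) \
    do { int r_ = (e); printf("leave %s\n", name); return r_; } while (0)

/* Original source:
 *     int sign(int x) { int result = 0; if (x < 0) return -1;
 *                       if (x > 0) result = 1; return result; }
 */
int sign(int x)
{
    int result = 0;
    ENTER("sign");              /* before the first executable statement */
    if (x < 0)
        EXIT_INT("sign", -1);   /* replaces "return -1;" */
    if (x > 0)
        result = 1;
    EXIT_INT("sign", result);   /* replaces "return result;" */
}

/* Original source:
 *     void bump(int *counter) { if (counter == NULL) return; *counter += 1; }
 */
void bump(int *counter)
{
    ENTER("bump");
    if (counter == NULL)
        EXIT_VOID("bump");      /* replaces "return;" */
    *counter += 1;
    EXIT_VOID("bump");          /* added: flow could fall off the end */
}
```

- Note that the exit macro must know the return type, which is one reason the
- compiler half has to track the declared type of every function.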
-
- The job of the file copying program is to copy all the characters of the main
- file while allowing Sherlock macros to be inserted and return statements to be
- revised. Characters from included files are not copied and an option is
- provided to eliminate unnecessary white space, including comments. The file
- copier is bulletproof: it will, for instance, correctly copy a Pascal program
- even though the parser becomes (as it should become) hopelessly confused.
-
- The compiler is based on a recursive descent parser, found in PAR.C, DCL.C and
- EXP.C. The only complication here is that the parser accepts both K&R C and
- ANSI Standard C. Oh yes, the parser also accepts some Borland and Microsoft
- extensions. To add your own extensions, change the function allow_mkeys().
-
- The parser calls the get_token() routine to get a stream of C tokens.
- The get_token() routine handles all preprocessor directives and macro expansion
- transparently to the parser. This has always been the part of SPP where the
- most real bugs lurked. Be warned. The files SPP.C, DEF.C, DIR.C, TOK.C and
- UTL.C contain the heart of the token processing routines. The file MST.C
- contains the macro symbol table used to record and expand macro definitions.
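-
- To make "transparently to the parser" concrete, here is a much simplified
- sketch of a token routine that expands object-like macros by pushing the
- macro body onto a stack of input sources. It is not SPP's code: tok_start(),
- tok_next(), the fixed macro table and the source stack are all assumptions
- made for illustration, and real preprocessing (directives, function-like
- macros, rescan rules) is far more involved.

```c
#include <ctype.h>
#include <string.h>

#define MAX_SRC 8               /* maximum nesting of macro expansions */

struct macro { const char *name; const char *body; };

/* Stand-in for a macro symbol table: two fixed object-like macros. */
static const struct macro macros[] = {
    { "SIZE",  "10" },
    { "TWICE", "SIZE + SIZE" }
};

static const char *src_stack[MAX_SRC];  /* pending input sources */
static int src_top = -1;                /* -1 means no input left */

static int is_word(char ch)
{
    unsigned char c = (unsigned char)ch;
    return isalnum(c) || c == '_';
}

static const char *find_macro(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof macros / sizeof macros[0]; i++)
        if (strcmp(macros[i].name, name) == 0)
            return macros[i].body;
    return NULL;
}

/* Begin tokenizing the given text. */
void tok_start(const char *text)
{
    src_top = 0;
    src_stack[0] = text;
}

/* Store the next token in buf (size must be at least 2).
 * Returns 1 on success and 0 at end of input.  Macro names never
 * reach the caller; their bodies are tokenized in their place. */
int tok_next(char *buf, size_t size)
{
    for (;;) {
        const char *p, *body;
        size_t n = 0;

        if (src_top < 0)
            return 0;
        p = src_stack[src_top];
        while (isspace((unsigned char)*p))
            p++;
        if (*p == '\0') {       /* this source is done: resume the outer one */
            src_top--;
            continue;
        }
        if (is_word(*p)) {
            while (is_word(*p) && n + 1 < size)
                buf[n++] = *p++;
        } else {
            buf[n++] = *p++;    /* single-character punctuation token */
        }
        buf[n] = '\0';
        src_stack[src_top] = p;

        body = find_macro(buf);
        if (body != NULL && src_top + 1 < MAX_SRC) {
            src_stack[++src_top] = body;    /* expand, then rescan */
            continue;
        }
        return 1;
    }
}
```

- With this table, the input "x = TWICE ;" yields the tokens x, =, 10, +, 10
- and ; in that order. The caller never sees SIZE or TWICE, which is the sense
- in which expansion is invisible to a parser.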
-
- The parser calls semantic routines in SEM.C. There are three kinds of
- semantic routines:
-
- 1. The functions whose names start with sd_ (for semantic declaration) keep
- track of the declared types of all functions and variables. The file ST.C
- contains the symbol table used by the semantic routines.
-
- 2. The functions whose names start with sf_ (for semantic flow) keep track of
- the possible flow of control through the executable statements in a function.
- These functions maintain a "flow stack" telling whether flow could ever fall
- through the end of the current statement.
-
- 3. The functions whose names start with so_ (for semantic output) create the
- additional macro calls that are generated by SPP. Because of the one-pass
- nature of SPP, calling these routines at exactly the right time is tricky.
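-
- The "flow stack" kept by the sf_ routines can be pictured with the sketch
- below. It is only a model under stated assumptions: the names flow_push(),
- flow_no_fall(), flow_pop() and falls_off_end() are invented here, and the
- only construct modelled is a function body consisting of a single if
- statement with an optional else.

```c
#include <assert.h>

#define FLOW_MAX 32

/* One entry per open statement: 1 if flow can fall through its end. */
static int flow_stack[FLOW_MAX];
static int flow_depth = 0;

/* Begin a statement; assume it can fall through until shown otherwise. */
static void flow_push(void)
{
    assert(flow_depth < FLOW_MAX);
    flow_stack[flow_depth++] = 1;
}

/* Record that the current statement cannot fall through (e.g. a return). */
static void flow_no_fall(void)
{
    assert(flow_depth > 0);
    flow_stack[flow_depth - 1] = 0;
}

/* End a statement; report whether flow can fall through it. */
static int flow_pop(void)
{
    assert(flow_depth > 0);
    return flow_stack[--flow_depth];
}

/* Model the body "{ if (c) THEN [else ELSE] }" and report whether flow
 * can fall off the end of the function, i.e. whether an exit macro
 * would have to be inserted before the closing brace. */
int falls_off_end(int then_returns, int has_else, int else_returns)
{
    int then_falls;
    int else_falls = 1;         /* a missing else always falls through */

    flow_push();                /* the function body */

    flow_push();                /* the then branch */
    if (then_returns)
        flow_no_fall();
    then_falls = flow_pop();

    if (has_else) {
        flow_push();            /* the else branch */
        if (else_returns)
            flow_no_fall();
        else_falls = flow_pop();
    }

    /* The if statement falls through if either branch does. */
    if (!then_falls && !else_falls)
        flow_no_fall();

    return flow_pop();
}
```

- For example, falls_off_end(1, 1, 1) models "if (c) return 1; else return 2;"
- and reports 0, so no exit macro is needed at the end of the function, while
- falls_off_end(1, 0, 0) models "if (c) return 1;" and reports 1.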
-
- Header files: The values of tokens used throughout SPP are defined in enum.h.
- The definitions of all global variables are found in glb.h. The prototypes of
- all globally visible functions are found in tmp.h. All other constants are
- found in spp.h.
-
- SPP is designed in separate modules. Variables and constants that are used
- only by a single module are declared static so that they are invisible outside
- the file in which they are defined. The partitioning of SPP into separate
- files was intended to reduce the number of global variables. In my opinion,
- this aspect of the design of SPP has been a complete success.
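-
- A minimal sketch of this convention, using a made-up module rather than any
- of SPP's own files: the variables below have internal linkage because they
- are declared static at file scope, so only the two functions are visible to
- other files.

```c
/* counter.c: a hypothetical module, not part of SPP. */

static int call_count = 0;          /* private to this file */
static const int call_limit = 3;    /* private constant */

/* Count one call.  Returns 1 if the call was counted, or 0 once the
 * limit has been reached. */
int counter_bump(void)
{
    if (call_count >= call_limit)
        return 0;
    call_count++;
    return 1;
}

/* The only way other files can read the private counter. */
int counter_value(void)
{
    return call_count;
}
```

- Another file may declare its own call_count without any clash, since
- neither static name is exported to the linker.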
-
- The file SYS.C contains low-level routines that might have to be rewritten
- should SPP be ported to another machine or operating system. Even in the days
- of the ANSI standard, the SYS.C module is quite useful.
-
- One of these low-level routines is sysnext(), which returns the next character
- from the current input file. This is the most frequently called routine in
- the whole program, and has been tweaked to make it run fast.
-
- The file copier program operates inside sysnext(), at the lowest level of the
- whole program. As characters are returned from sysnext(), they are placed in
- the hold buffer. Characters are not placed in the hold buffer if they come
- from an included file or from a macro expansion.
-
- The hold buffer is output when syshflush() is called, or emptied (either
- partially or completely) without being output when syshkill() or syshnldel()
- is called. These calls to syshflush(), syshkill() and syshnldel() form the
- interface between the file copier and the parser. Although only 11 calls to
- these routines appear throughout the parser, getting these 11 calls right is
- tricky: syshflush() must be called often enough that the hold buffer never
- overflows, but it must never be called while a Sherlock macro may still have
- to be output, since in that case either syshkill() or syshnldel() would be
- called instead.
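-
- The interplay between flushing and killing the hold buffer can be sketched
- as follows. This is an illustration, not SPP's implementation: hold_put(),
- hold_flush() and hold_kill() are invented stand-ins for the copying done in
- sysnext(), syshflush() and syshkill(), the output "file" is an in-memory
- string, and EXIT_INT is a made-up exit macro name. The partial emptying done
- by syshnldel() is not modelled.

```c
#include <assert.h>
#include <string.h>

#define HOLD_MAX 256
#define OUT_MAX  1024

static char   hold[HOLD_MAX];   /* characters read but not yet copied */
static size_t hold_len = 0;
static char   out[OUT_MAX];     /* stands in for the output file */
static size_t out_len = 0;

static void out_write(const char *s, size_t n)
{
    assert(out_len + n < OUT_MAX);
    memcpy(out + out_len, s, n);
    out_len += n;
    out[out_len] = '\0';
}

/* Hold one character of the main file (the role played inside sysnext()). */
void hold_put(int c)
{
    assert(hold_len < HOLD_MAX);    /* flush often enough to avoid this */
    hold[hold_len++] = (char)c;
}

/* Copy the held characters to the output (the role of syshflush()). */
void hold_flush(void)
{
    out_write(hold, hold_len);
    hold_len = 0;
}

/* Discard the held characters without copying them (the role of syshkill()). */
void hold_kill(void)
{
    hold_len = 0;
}

/* Copy "x = 1; " unchanged, then replace "return x;" by an exit macro. */
const char *demo_rewrite_return(void)
{
    const char *macro = "EXIT_INT(\"f\", x);";
    const char *p;

    hold_len = 0;
    out_len = 0;
    out[0] = '\0';

    for (p = "x = 1; "; *p != '\0'; p++)
        hold_put(*p);
    hold_flush();               /* nothing to revise: copy as is */

    for (p = "return x;"; *p != '\0'; p++)
        hold_put(*p);
    hold_kill();                /* drop the original return statement */
    out_write(macro, strlen(macro));

    return out;
}
```

- Had hold_flush() been called while "return x;" was still being read, the
- original text would already have been written and could no longer be
- replaced, which is the ordering hazard described above.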
-